Marginal Distribution

Definition

The marginal distribution of a subset of a collection of random variables is the probability distribution of the variables contained in the subset.

p(x) = \sum_y p(x, y) = \sum_y p(x|y)\, p(y)

which corresponds to marginalizing out y, i.e. we throw away y and restrict attention to x. For continuous variables the sum is replaced by an integral:

p(x) = \int_y p(x, y)\, dy = \int_y p(x|y)\, p(y)\, dy
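
As a concrete check of the discrete formula, here is a minimal NumPy sketch; the 3x2 joint table is made up purely for illustration. It shows that summing the joint table over y and applying the factored form \sum_y p(x|y) p(y) give the same marginal p(x).

```python
import numpy as np

# Hypothetical joint distribution p(x, y): rows index x, columns index y.
p_xy = np.array([[0.10, 0.20],
                 [0.25, 0.15],
                 [0.05, 0.25]])

# Marginalize y out directly: p(x) = sum_y p(x, y)
p_x = p_xy.sum(axis=1)

# Factored form: p(x) = sum_y p(x|y) p(y)
p_y = p_xy.sum(axis=0)          # p(y)
p_x_given_y = p_xy / p_y        # column j holds p(x | y = j)
p_x_alt = (p_x_given_y * p_y).sum(axis=1)

print(p_x)      # [0.3 0.4 0.3]
print(p_x_alt)  # identical, via p(x|y) p(y)
```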

Application of marginalisation

p(x|z) = \sum_y p(x, y|z) = \sum_y p(x|y, z)\, p(y|z)

p(x|z) = \int_y p(x|y, z)\, p(y|z)\, dy

when x is conditionally independent of z given y, i.e. p(x|y, z) = p(x|y), this simplifies to:

p(x|z) = \int_y p(x|y)\, p(y|z)\, dy
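
In the discrete case this marginalization is just a matrix-vector product. Below is a small sketch with hypothetical conditional tables for one fixed value of z (the numbers are arbitrary), showing that \sum_y p(x|y, z) p(y|z) again yields a valid distribution over x.

```python
import numpy as np

# Hypothetical conditional tables for one fixed z:
# p_x_given_yz[x, y] = p(x | y, z),  p_y_given_z[y] = p(y | z)
p_x_given_yz = np.array([[0.7, 0.2],
                         [0.3, 0.8]])
p_y_given_z = np.array([0.4, 0.6])

# Marginalize y out of the conditional: p(x|z) = sum_y p(x|y, z) p(y|z)
p_x_given_z = p_x_given_yz @ p_y_given_z

print(p_x_given_z)        # [0.4 0.6]
print(p_x_given_z.sum())  # 1.0, still a proper distribution
```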

Predictive Distribution

For a parameter θ and observed data Y, both the prior p(θ) and the posterior p(θ|Y) are distributions over the parameter θ. If you want to predict new data ŷ, you need a predictive distribution of ŷ built either from prior information alone (the prior predictive distribution) or from the observed data (the posterior predictive distribution).

Prior Predictive Distribution

obtained by marginalising θ over the prior:

p(\hat{y}) = \int p(\hat{y}|\theta)\, p(\theta)\, d\theta
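
A simple way to see this integral in practice is Monte Carlo: draw θ from the prior and average p(ŷ|θ) over the draws. The sketch below assumes a toy Beta-Bernoulli model (θ ~ Beta(2, 2), ŷ|θ ~ Bernoulli(θ)) chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy model: theta ~ Beta(2, 2), y_hat | theta ~ Bernoulli(theta).
# Prior predictive p(y_hat) = ∫ p(y_hat|theta) p(theta) dtheta, approximated
# by averaging p(y_hat = 1 | theta) = theta over prior draws of theta.
theta_prior = rng.beta(2.0, 2.0, size=100_000)
p_yhat_1 = theta_prior.mean()

print(p_yhat_1)  # ≈ 0.5, matching the analytic value a/(a+b) for Beta(2, 2)
```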

Posterior Predictive Distribution

p(\hat{y}|Y) = \int p(\hat{y}|\theta, Y)\, p(\theta|Y)\, d\theta

if we assume the new data ŷ is conditionally independent of Y given θ:

p(\hat{y}|Y) = \int p(\hat{y}|\theta)\, p(\theta|Y)\, d\theta
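
The same Monte Carlo idea works here, now averaging over draws from the posterior rather than the prior. The sketch below again assumes the toy Beta-Bernoulli model, where the conjugate posterior Beta(a + Σy, b + n − Σy) is available in closed form; the observed data Y is made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy model: theta ~ Beta(2, 2), observations in Y are Bernoulli(theta).
Y = np.array([1, 1, 0, 1, 1, 0, 1, 1])   # hypothetical observed data
a_post = 2.0 + Y.sum()                   # conjugate update of the Beta prior
b_post = 2.0 + len(Y) - Y.sum()

# Posterior predictive p(y_hat = 1 | Y) = ∫ p(y_hat = 1|theta) p(theta|Y) dtheta,
# approximated by averaging theta over posterior draws.
theta_post = rng.beta(a_post, b_post, size=100_000)
p_yhat_1 = theta_post.mean()

print(p_yhat_1)  # ≈ a_post / (a_post + b_post) = 8/12 ≈ 0.667
```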